Abstract. The Evidential-EM (E2M) algorithm is an effective approach for
computing maximum likelihood estimates under finite mixture models,
especially when the information about the data is uncertain. In this paper
we present an extension of the E2M method to a particular case of incomplete
data, where the loss of information is due to both mixture models
and censored observations. The prior uncertain information is expressed
by belief functions, while the pseudo-likelihood function is derived from
the imprecise observations and the prior knowledge. The E2M method is then
invoked to maximize the generalized likelihood function and obtain the
optimal parameter estimates. Numerical examples show that the
proposed method can effectively integrate the uncertain prior information
with the current imprecise knowledge conveyed by the observed data.

1 Introduction

In life-testing experiments, the data are often censored. A datum $T_i$ is said
to be right-censored if the event occurs at some time after a right bound, but
we do not know exactly when. The only information we have is this right bound.
The two most common right-censoring schemes are Type-I and Type-II censoring.
Experiments using these schemes have the drawback that they do not allow
removal of samples at time points other than the end of the experiment. The
progressive censoring scheme, which does allow such removals, has become very
popular in life tests in the last few years [1]. Censored data provide
imprecise information for reliability analysis.

It is interesting to evaluate the reliability performance of items with mixture
distributions. When the population is composed of several subpopulations, an
instance in the data set is expected to have a label that represents its origin,
that is, the subpopulation from which the datum is observed. In real-world data,
observed labels may carry only partial information about the origins of samples.
Thus the censored data from mixture distributions are affected by imprecision
and uncertainty at the same time. The Evidential-EM (E2M) method, proposed
by Denoeux, is an effective approach for computing maximum likelihood
estimates for the mixture problem, especially when the knowledge about the data
is both imprecise and uncertain. However, it has not been used for reliability
analysis or censored life tests.

This paper considers a special kind of incomplete data in life tests, where the
loss of information is due simultaneously to the mixture problem and to censored
observations. The data set analysed in this paper is a merger of samples from
different classes. Some uncertain information about the class values of these
unlabeled data is expressed by belief functions. The pseudo-likelihood function
is obtained from the imprecise observations and the uncertain prior information,
and the E2M method is then invoked to maximize the generalized likelihood
function. The simulation studies show that the proposed method can take
advantage of the partial labels, and thus incorporates more information than
traditional EM algorithms.


2 Theoretical analysis 

The progressive censoring scheme has attracted considerable attention in recent
years, since it has the flexibility of allowing removal of units at points other
than the terminal point of the experiment [1]. The theory of belief functions
was first described by Dempster [5] through the study of upper and lower
probabilities, and was later extended by Shafer [6]. This section gives a brief
description of these two concepts.

2.1 The Type-II progressive censoring scheme

The model of the Type-II progressive censoring scheme (PCS) is described as
follows [1]. Suppose $n$ independent identical items are placed on a life test
with corresponding lifetimes $X_1, X_2, \cdots, X_n$. We assume that the $X_i$
($i = 1, 2, \cdots, n$) are i.i.d. with probability density function (pdf)
$f(x;\theta)$ and cumulative distribution function (cdf) $F(x;\theta)$. The
integer $J < n$ is fixed at the beginning of the experiment. The $J$ values
$R_1, R_2, \cdots, R_J$ are fixed in advance and satisfy
$R_1 + R_2 + \cdots + R_J + J = n$. During the experiment, the $j$-th failure is
observed and, immediately after the failure, $R_j$ functioning items are
randomly removed from the test. We denote the time of the $j$-th failure by
$X_{j:J:n}$, where $J$ and $n$ describe the censoring scheme used in the
experiment, that is, there are $n$ test units and the experiment stops after
$J$ failures are observed. Therefore, under the Type-II progressive censoring
scheme, we have the observations $\{X_{1:J:n}, \cdots, X_{J:J:n}\}$. The
likelihood function is given by

$$L(\theta; X_{1:J:n}, \cdots, X_{J:J:n}) = C \prod_{i=1}^{J} f(X_{i:J:n};\theta)\,[1 - F(X_{i:J:n};\theta)]^{R_i}, \tag{1}$$

where $C = n(n-1-R_1)(n-2-R_1-R_2)\cdots(n-J+1-R_1-R_2-\cdots-R_{J-1})$.
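
The censoring mechanism and the likelihood of Eq. (1) can be illustrated with a
short Python sketch. It is only an illustration under stated assumptions: the
function names, the exponential lifetime model and the removal vector are
hypothetical choices, not part of the original study.

```python
import math
import random


def progressive_type2_sample(lifetimes, removals, rng):
    """Return the J observed failure times under Type-II progressive censoring.

    lifetimes: the n latent lifetimes; removals: (R_1, ..., R_J) with
    R_1 + ... + R_J + J == n.
    """
    if sum(removals) + len(removals) != len(lifetimes):
        raise ValueError("removals must satisfy R_1 + ... + R_J + J = n")
    alive = sorted(lifetimes)
    failures = []
    for r in removals:
        failures.append(alive.pop(0))  # next failure: smallest surviving lifetime
        for _ in range(r):             # withdraw r surviving units at random
            alive.pop(rng.randrange(len(alive)))
    return failures


def log_likelihood(failures, removals, log_pdf, log_sf):
    """Logarithm of Eq. (1); log_sf(x) must return log(1 - F(x))."""
    at_risk = sum(removals) + len(removals)  # n units at the start
    total = 0.0
    for x, r in zip(failures, removals):
        total += math.log(at_risk)           # one factor of the constant C
        total += log_pdf(x) + r * log_sf(x)
        at_risk -= 1 + r                     # one failure plus r withdrawals
    return total


# Hypothetical example: exponential lifetimes with rate 1.5, n = 10, J = 4.
rng = random.Random(0)
rate = 1.5
removals = [2, 0, 1, 3]
lifetimes = [rng.expovariate(rate) for _ in range(10)]
failures = progressive_type2_sample(lifetimes, removals, rng)
value = log_likelihood(
    failures,
    removals,
    log_pdf=lambda x: math.log(rate) - rate * x,
    log_sf=lambda x: -rate * x,
)
```

Working on the log scale avoids underflow when $J$ is large; the constant $C$
does not depend on $\theta$ and could be dropped when only the maximizer is
needed.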

2.2 Theory of belief functions

Let $\Theta = \{\theta_1, \theta_2, \cdots, \theta_N\}$ be the finite domain of
$X$, called the discernment frame. The mass function is defined on the power set
$2^\Theta = \{A : A \subseteq \Theta\}$. The function $m : 2^\Theta \to [0,1]$
is said to be the basic belief assignment (bba) on $2^\Theta$ if it satisfies:

$$\sum_{A \subseteq \Theta} m(A) = 1. \tag{2}$$

Every $A \in 2^\Theta$ such that $m(A) > 0$ is called a focal element. The
credibility and plausibility functions are defined in Eq. (3) and Eq. (4):

$$Bel(A) = \sum_{\emptyset \neq B \subseteq A} m(B), \quad \forall A \subseteq \Theta, \tag{3}$$

$$Pl(A) = \sum_{B \cap A \neq \emptyset} m(B), \quad \forall A \subseteq \Theta. \tag{4}$$

Each quantity $Bel(A)$ denotes the degree to which the evidence supports $A$,
while $Pl(A)$ can be interpreted as an upper bound on the degree of support that
could be assigned to $A$ if more specific information became available [7]. The
function $pl : \Theta \to [0,1]$ such that $pl(\theta) = Pl(\{\theta\})$ is
called the contour function associated to $m$.

If $m$ has a single focal element $A$, it is said to be categorical and denoted
as $m_A$. If all focal elements of $m$ are singletons, then $m$ is said to be
Bayesian. Bayesian mass functions are equivalent to probability distributions.
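
The definitions above can be made concrete with a small Python sketch. The
representation of a bba as a dictionary from `frozenset` focal elements to
masses, and the numerical values, are assumptions made for illustration only.

```python
def bel(m, a):
    """Credibility of a, Eq. (3): total mass of non-empty subsets of a."""
    return sum(v for b, v in m.items() if b and b <= a)


def pl(m, a):
    """Plausibility of a, Eq. (4): total mass of focal sets intersecting a."""
    return sum(v for b, v in m.items() if b & a)


def contour(m, frame):
    """Contour function: plausibility of each singleton of the frame."""
    return {w: pl(m, frozenset([w])) for w in frame}


# Hypothetical bba on the frame {a, b, c}; focal sets are frozensets.
frame = frozenset("abc")
m = {frozenset("a"): 0.5, frozenset("ab"): 0.3, frame: 0.2}

ab = frozenset("ab")
print(bel(m, ab), pl(m, ab))   # credibility and plausibility of {a, b}
print(contour(m, frame))
```

In this example the mass on the whole frame counts towards every plausibility
but towards no credibility other than that of the frame itself, which is why
$Bel(A) \le Pl(A)$.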

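If there are two distinct pieces of evidence (bbas) on the same frame, they
can be combined using Dempster’s rule [5] to form a new bba:

$$m_{1 \oplus 2}(C) = \frac{\sum_{A \cap B = C} m_1(A)\, m_2(B)}{1 - \sum_{A \cap B = \emptyset} m_1(A)\, m_2(B)}, \quad \forall C \subseteq \Theta,\ C \neq \emptyset. \tag{5}$$

Let $m_1$ be a Bayesian mass function with corresponding contour function
$p_1$, and let $m_2$ be an arbitrary mass function with contour function
$pl_2$. The combination of $m_1$ and $m_2$ yields a Bayesian mass function
$m_1 \oplus m_2$ with contour function $p_1 \oplus pl_2$ defined by

$$p_1 \oplus pl_2(\omega) = \frac{p_1(\omega)\, pl_2(\omega)}{\sum_{\omega' \in \Omega} p_1(\omega')\, pl_2(\omega')}. \tag{6}$$

The conflict between $p_1$ and $pl_2$ is
$k = 1 - \sum_{\omega' \in \Omega} p_1(\omega')\, pl_2(\omega')$. It equals one
minus the expectation of $pl_2$ with respect to $p_1$.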
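
The following Python sketch checks numerically that Dempster's rule applied to
a Bayesian bba reduces to the normalized product of Eq. (6). The function
names, the dictionary representation of a bba and the numbers are hypothetical
and serve only as an illustration.

```python
def dempster(m1, m2):
    """Combine two bbas by Dempster's rule, Eq. (5).

    Returns the combined bba and the mass k assigned to the empty set
    before normalization (the degree of conflict).
    """
    raw = {}
    for a, va in m1.items():
        for b, vb in m2.items():
            c = a & b
            raw[c] = raw.get(c, 0.0) + va * vb
    k = raw.pop(frozenset(), 0.0)
    if k >= 1.0:
        raise ValueError("total conflict: the bbas cannot be combined")
    return {c: v / (1.0 - k) for c, v in raw.items()}, k


def combine_contours(p1, pl2):
    """Eq. (6): pointwise product of p1 and pl2, renormalized."""
    prod = {w: p1[w] * pl2[w] for w in p1}
    total = sum(prod.values())
    return {w: v / total for w, v in prod.items()}, 1.0 - total


# Hypothetical example on the frame {a, b, c}.
frame = frozenset("abc")
p1 = {"a": 0.6, "b": 0.3, "c": 0.1}                      # Bayesian m1
m1 = {frozenset([w]): v for w, v in p1.items()}
m2 = {frozenset("a"): 0.5, frozenset("ab"): 0.3, frame: 0.2}
pl2 = {w: sum(v for s, v in m2.items() if w in s) for w in frame}

m12, k = dempster(m1, m2)
p12, k_contour = combine_contours(p1, pl2)
```

Both routes give the same singleton masses and the same conflict, so only the
contour function of $m_2$ is needed when $m_1$ is Bayesian.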


3 The E2M algorithm for Type-II PCS 

3.1 The generalized likelihood function and E2M algorithm 

The E2M algorithm, like the EM method, is an iterative optimization technique
for obtaining the maximum of the observed-data likelihood function. However, the
data used in the E2M model can be imprecise and uncertain. The imprecision
may be caused by missing information or hidden variables, and this problem
can be solved by the EM approach. The uncertainty may be due to unreliable
sensors, errors caused by the measuring or estimation methods, and so on.
In the E2M model, the uncertainty is represented by belief functions.

Let $X$ be a discrete variable defined on $\Omega_X$ with probability density
function $p_X(\cdot;\theta)$. If $x$ is an observed sample of $X$, the
likelihood function can be expressed as:

$$L(\theta; x) = p_X(x;\theta). \tag{7}$$

If $x$ is not completely observed, and all we know is that $x \in A$,
$A \subseteq \Omega_X$, then the likelihood function becomes:

$$L(\theta; A) = \sum_{x \in A} p_X(x;\theta). \tag{8}$$

If there is some uncertain information about $x$, for example when experts
give their belief about $x$ in the form of a mass function $m(A_i)$,
$i = 1, 2, \cdots, r$, $A_i \subseteq \Omega_X$, then the likelihood becomes:

$$L(\theta; m) = \sum_{i=1}^{r} m(A_i)\, L(\theta; A_i) = \sum_{x \in \Omega_X} p_X(x;\theta)\, pl(x). \tag{9}$$

It can be seen from Eq. (9) that the likelihood $L(\theta; m)$ depends on $m$
only through its associated contour function $pl$. Thus we can write
indifferently $L(\theta; m)$ or $L(\theta; pl)$.
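
The equality of the two forms of the generalized likelihood in Eq. (9) can be
checked on a small example. The model probabilities and the expert bba below
are hypothetical values chosen for illustration.

```python
def likelihood_from_masses(p, m):
    """Left-hand form of Eq. (9): sum_i m(A_i) * L(theta; A_i)."""
    return sum(mass * sum(p[x] for x in a) for a, mass in m.items())


def likelihood_from_contour(p, m):
    """Right-hand form of Eq. (9): sum_x p_X(x; theta) * pl(x)."""
    pl = {x: sum(mass for a, mass in m.items() if x in a) for x in p}
    return sum(p[x] * pl[x] for x in p)


# Hypothetical model p_X(.; theta) on {0, 1, 2} and expert bba m.
p = {0: 0.2, 1: 0.5, 2: 0.3}
m = {frozenset([1]): 0.6, frozenset([0, 1]): 0.3, frozenset([0, 1, 2]): 0.1}

lhs = likelihood_from_masses(p, m)
rhs = likelihood_from_contour(p, m)
```

The two values agree because each $p_X(x;\theta)$ is weighted by the total mass
of the focal elements that contain $x$, which is exactly $pl(x)$.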

Let $W = (X, Z)$ be the complete variable set. $X$ is the observable data,
while $Z$ is unobservable but subject to some uncertain knowledge in the form
of $pl_Z$. The log-likelihood based on the complete sample is
$\log L(\theta; W)$. In E2M, the observed-data log-likelihood is
$\log L(\theta; X, pl_Z)$.

In the E-step of the E2M algorithm, the pseudo-likelihood function is
calculated as:

$$Q(\theta, \theta^k) = E_{\theta^k}[\log L(\theta; W) \mid X, pl_Z; \theta^k], \tag{10}$$

where $pl_Z$ is the contour function describing our uncertainty on $Z$, and
$\theta^k$ is the parameter vector obtained at the $k$-th step. $E_{\theta^k}$
represents the expectation with respect to the following density:

$$\gamma'(Z = j \mid X, pl_Z; \theta^k) \triangleq \gamma(Z = j \mid X; \theta^k) \oplus pl_Z. \tag{11}$$

The function $\gamma'$ can be regarded as a combination of the conditional
probability density $\gamma(Z = j \mid X; \theta^k) = p_Z(Z = j \mid X; \theta^k)$
and the contour function $pl_Z$. It describes the current information based on
the observation $X$ and the prior uncertain information on $Z$; this
combination is thus similar to Bayes' rule.

According to the Dempster combination rule and Eq. (11), we get:

$$\gamma'(Z = j \mid X, pl_Z; \theta^k) = \frac{\gamma(Z = j \mid X; \theta^k)\, pl_Z(Z = j)}{\sum_{j'} \gamma(Z = j' \mid X; \theta^k)\, pl_Z(Z = j')}. \tag{12}$$

Therefore, the pseudo-likelihood is:

$$Q(\theta, \theta^k) = \frac{\sum_{j} \gamma(Z = j \mid X; \theta^k)\, pl_Z(Z = j) \log L(\theta; W)}{L(\theta^k; X, pl_Z)}. \tag{13}$$


The M-step is the same as in EM and requires the maximization of
$Q(\theta, \theta^k)$ with respect to $\theta$. The E2M algorithm alternately
repeats the E- and M-steps above until the increase of the generalized
observed-data likelihood becomes smaller than a given threshold.
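
The alternation of E- and M-steps and the stopping rule can be sketched on a
deliberately small toy problem. Everything in the block below is a hypothetical
illustration: a binary observation $x$, a hidden binary class $Z$ with unknown
proportion `lam` $= P(Z = 1)$, known class-conditional probabilities, and one
contour function per sample. It is not the model studied in this paper.

```python
import math

# Known class-conditional probabilities P(x = 1 | z) for z = 0, 1 (toy values).
P_ONE = (0.2, 0.9)


def joint(x, z, lam):
    """P(Z = z) * P(x | z) for the mixing proportion lam = P(Z = 1)."""
    prior = lam if z == 1 else 1.0 - lam
    like = P_ONE[z] if x == 1 else 1.0 - P_ONE[z]
    return prior * like


def e_step(data, pls, lam):
    """Eq. (12): posterior of Z combined with each contour function."""
    weights = []
    for x, pl in zip(data, pls):
        prod = [joint(x, z, lam) * pl[z] for z in (0, 1)]
        weights.append([v / sum(prod) for v in prod])
    return weights


def log_likelihood(data, pls, lam):
    """Generalized observed-data log-likelihood log L(theta; X, pl_Z)."""
    return sum(
        math.log(sum(joint(x, z, lam) * pl[z] for z in (0, 1)))
        for x, pl in zip(data, pls)
    )


def run_e2m(data, pls, lam, tol=1e-8, max_iter=500):
    """Alternate E- and M-steps until the likelihood gain falls below tol."""
    current = log_likelihood(data, pls, lam)
    for _ in range(max_iter):
        weights = e_step(data, pls, lam)
        lam = sum(w[1] for w in weights) / len(data)   # M-step for lam
        updated = log_likelihood(data, pls, lam)
        if updated - current < tol:
            break
        current = updated
    return lam, updated


data = [1, 1, 0, 1, 0, 0, 1, 1]
pls = [(0.3, 1.0), (1.0, 1.0), (1.0, 0.2), (1.0, 1.0),
       (1.0, 1.0), (1.0, 0.5), (0.1, 1.0), (1.0, 1.0)]
start = log_likelihood(data, pls, 0.5)
lam_hat, final = run_e2m(data, pls, 0.5)
```

A contour function equal to `(1.0, 1.0)` carries no label information, so the
corresponding sample is treated exactly as in ordinary EM.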


3.2 Mixed-distributed progressively censored data 

Here, we present a special type of incomplete data, where the imperfection of
information is due both to the mixed distribution and to some censored
observations. Let $Y$ denote the lifetime of the test samples. The $n$ test
samples can be divided into two parts, $Y_1$ and $Y_2$, where $Y_1$ is the set
of observed data, while $Y_2$ is the set of censored data. Let $Z$ be the class
labels and $W = (Y, Z)$ represent the complete data.

Assume that $Y$ is from a mixed distribution with p.d.f.

$$f_Y(y; \theta) = \sum_{z=1}^{p} \lambda_z f(y; \xi_z), \tag{14}$$

where $\theta = (\lambda_1, \cdots, \lambda_p, \xi_1, \cdots, \xi_p)$. The
complete data distribution of $W$ is given by $P(Z = z) = \lambda_z$ and
$P(Y \mid Z = z) = f(y \mid \xi_z)$. The variable $Z$ is hidden, but we may
have some prior knowledge about it. This kind of prior uncertain information on
$Z$ can be described in the form of belief functions:

$$pl_Z(Z = j) = pl_j, \quad j = 1, 2, \cdots, p. \tag{15}$$

The likelihood of the complete data is:

$$L^c(\theta; Y, Z) = \prod_{j=1}^{n} f(y_j, z_j; \theta), \tag{16}$$

and the pseudo-likelihood function is:

$$Q(\theta, \theta^k) = E_{\theta^k}[\log L^c(\theta; Y, Z) \mid Y^*, pl_Z; \theta^k], \tag{17}$$

where $E_{\theta^k}[\cdot \mid Y^*, pl_Z; \theta^k]$ denotes expectation with
respect to the conditional distribution of $W$ given the observation $Y^*$ and
the uncertain information $pl_Z$.


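Theorem 1. For $(y_j, z_j)$ complete and censored,
$f_{YZ}(y_j, z_j \mid y_j^*; \theta^k)$ can be calculated according to Eq. (18)
and Eq. (19), respectively. Let $y_j^*$ be the $j$-th observation. If the
$j$-th sample is completely observed, $y_j = y_j^*$; otherwise $y_j > y_j^*$.

$$f_{YZ}(y_j, z_j \mid y_j^*; \theta^k) = I_{\{y_j = y_j^*\}}\, P^k_{1jz}, \tag{18}$$

$$f_{YZ}(y_j, z_j \mid y_j^*; \theta^k) = I_{\{y_j > y_j^*\}}\, P^k_{2jz}\, \frac{f(y_j; \xi_z^k)}{\bar{F}(y_j^*; \xi_z^k)}, \tag{19}$$

where $\bar{F} = 1 - F$ denotes the survival function, and $P^k_{1jz}$ and
$P^k_{2jz}$ are shown in Eq. (20):

$$P^k_{ijz}(z_j = z \mid y_j^*; \theta^k) = \begin{cases} P^k_{1jz}(z_j = z \mid y_j^*; \theta^k) & \text{for the completely observed data,} \\ P^k_{2jz}(z_j = z \mid y_j^*; \theta^k) & \text{for the censored data,} \end{cases} \tag{20}$$

where

$$P^k_{1jz}(z_j = z \mid y_j^*; \theta^k) = \frac{f(y_j^*; \xi_z^k)\, \lambda_z^k}{\sum_{z'} f(y_j^*; \xi_{z'}^k)\, \lambda_{z'}^k}, \tag{21}$$

$$P^k_{2jz}(z_j = z \mid y_j^*; \theta^k) = \frac{\bar{F}(y_j^*; \xi_z^k)\, \lambda_z^k}{\sum_{z'} \bar{F}(y_j^*; \xi_{z'}^k)\, \lambda_{z'}^k}. \tag{22}$$

Proof. If $(y_j, z_j)$ are completely observed,

$$f_{YZ}(y_j, z_j \mid y_j^*; \theta^k) = P^k_{1jz}\, f(y_j \mid y_j^* = y_j, Z_j = z; \theta^k),$$

and we obtain Eq. (18).

If $(y_j, z_j)$ are censored,

$$f_{YZ}(y_j, z_j \mid y_j^*; \theta^k) = P^k_{2jz}\, f(y_j \mid y_j^* < y_j, Z_j = z; \theta^k).$$

From the conditional density of a lifetime given that it exceeds the censoring
time,

$$f(y_j \mid y_j^* < y_j, Z_j = z; \theta^k) = \frac{f(y_j; \xi_z^k)}{\bar{F}(y_j^*; \xi_z^k)}\, I_{\{y_j > y_j^*\}},$$

we get Eq. (19).

This completes the proof.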

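From the above theorem, the pseudo-likelihood function can be written as:

$$\begin{aligned} Q(\theta, \theta^k) &= E_{\theta^k}[\log f^c(Y, Z) \mid Y^*, pl_Z; \theta^k] \\ &= E_{\theta^k}\Big[\sum_{j=1}^{n} \big(\log \lambda_z + \log f(y_j \mid \xi_z)\big) \,\Big|\, Y^*, pl_Z; \theta^k\Big] \\ &= \sum_{y_j \in Y_1} \sum_{z} P'^{k}_{1jz} \log \lambda_z + \sum_{y_j \in Y_2} \sum_{z} P'^{k}_{2jz} \log \lambda_z \\ &\quad + \sum_{y_j \in Y_1} \sum_{z} P'^{k}_{1jz} \log f(y_j^*; \xi_z) + \sum_{y_j \in Y_2} \sum_{z} P'^{k}_{2jz} \int_{y_j^*}^{+\infty} \log f(x; \xi_z)\, \frac{f(x; \xi_z^k)}{1 - F(y_j^*; \xi_z^k)}\, dx, \end{aligned} \tag{23}$$

where

$$P'^{k}_{ijz}(z_j = z \mid y_j^*, pl_{Z_j}; \theta^k) = P^k_{ijz}(z_j = z \mid y_j^*; \theta^k) \oplus pl_{Z_j}, \quad i = 1, 2.$$

It can be seen that $P'^{k}_{ijz}(z_j = z \mid y_j^*, pl_{Z_j}; \theta^k)$ is a
Dempster combination of the prior and the observed information.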
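
The class posteriors of Eqs. (21) and (22) and their combination with the soft
label can be sketched for a single unit as follows. The sketch assumes Rayleigh
components with density $\xi^2 y \exp(-\xi^2 y^2/2)$, the family used in the
remainder of this section; the function names and numerical values are
hypothetical.

```python
import math


def rayleigh_pdf(y, xi):
    """Density xi^2 * y * exp(-xi^2 * y^2 / 2) of one mixture component."""
    return xi * xi * y * math.exp(-0.5 * xi * xi * y * y)


def rayleigh_sf(y, xi):
    """Survival function 1 - F(y) = exp(-xi^2 * y^2 / 2)."""
    return math.exp(-0.5 * xi * xi * y * y)


def membership(y_star, censored, lam, xi, pl):
    """Class posteriors for one unit, before and after using the soft label.

    Returns (P, P_prime): P follows Eq. (21) for an observed failure or
    Eq. (22) for a censored unit; P_prime is P combined with the contour
    function pl, i.e. the normalized pointwise product.
    """
    component = rayleigh_sf if censored else rayleigh_pdf
    joint = [l * component(y_star, x) for l, x in zip(lam, xi)]
    total = sum(joint)
    p = [v / total for v in joint]
    weighted = [a * b for a, b in zip(p, pl)]
    norm = sum(weighted)
    return p, [v / norm for v in weighted]


# Hypothetical current parameters and one censored unit at y* = 1.2.
lam = [1 / 3, 1 / 3, 1 / 3]
xi = [4.0, 0.5, 0.8]
p, p_soft = membership(1.2, True, lam, xi, pl=[0.1, 0.8, 0.1])
p_vac, p_same = membership(1.2, True, lam, xi, pl=[1.0, 1.0, 1.0])
```

With a vacuous contour function (all plausibilities equal to one) the combined
posterior coincides with the ordinary EM posterior, which is the unsupervised
case.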

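Without loss of generality, assume that the data are from the mixed-Rayleigh
distribution, whose p.d.f. is shown in Eq. (24):

$$f_X(x; \lambda, \xi) = \sum_{j=1}^{p} \lambda_j\, g_X(x; \xi_j) = \sum_{j=1}^{p} \lambda_j\, \xi_j^2\, x \exp\Big\{-\frac{1}{2} \xi_j^2 x^2\Big\}. \tag{24}$$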

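After the $k$-th iteration, when $\theta^k = (\lambda^k, \xi^k)$ has been
obtained, the $(k+1)$-th step of the E2M algorithm is as follows:

1. E-step: For $j = 1, 2, \cdots, n$ and $z = 1, 2, \cdots, p$, use Eq. (17) to
obtain the conditional p.d.f. of $\log L^c(\theta; W)$ based on the observed
data, the prior uncertain information and the current parameters.

2. M-step: Maximize $Q(\theta, \theta^k)$ and update the parameters:

$$\lambda_z^{k+1} = \frac{1}{n} \Big( \sum_{y_j \in Y_1} P'^{k}_{1jz} + \sum_{y_j \in Y_2} P'^{k}_{2jz} \Big), \tag{25}$$

$$(\xi_z^{k+1})^2 = \frac{2 \Big( \sum_{y_j \in Y_1} P'^{k}_{1jz} + \sum_{y_j \in Y_2} P'^{k}_{2jz} \Big)}{\sum_{y_j \in Y_1} P'^{k}_{1jz}\, y_j^{*2} + \sum_{y_j \in Y_2} P'^{k}_{2jz} \big( y_j^{*2} + 2/(\xi_z^k)^2 \big)}. \tag{26}$$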
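
The term $2/(\xi_z^k)^2$ attached to the censored units in the update of
$\xi_z$ can be traced to the conditional second moment of a censored Rayleigh
lifetime. The derivation below is a sketch added for illustration; it assumes
the component density $\xi^2 y \exp(-\xi^2 y^2/2)$ and writes $P'_{jz}$ for the
combined membership weight of unit $j$, whether observed or censored.

```latex
% Assumption: component density f(y;\xi) = \xi^2 y \exp(-\xi^2 y^2/2),
% hence survival function \bar F(y;\xi) = \exp(-\xi^2 y^2/2).
\begin{align*}
P(Y^2 > t \mid Y > y^*) &= \frac{\exp(-\xi^2 t/2)}{\exp(-\xi^2 y^{*2}/2)}
  = \exp\!\big(-\tfrac{\xi^2}{2}(t - y^{*2})\big), \qquad t \ge y^{*2}, \\
E[Y^2 \mid Y > y^*] &= y^{*2} + \int_{y^{*2}}^{\infty} P(Y^2 > t \mid Y > y^*)\, dt
  = y^{*2} + \frac{2}{\xi^2}.
\end{align*}
% The \xi_z-dependent part of Q is
%   \sum_j P'_{jz} \big( 2\log\xi_z - \tfrac12 \xi_z^2\, E_j[Y^2] \big),
% with E_j[Y^2] = y_j^{*2} for observed units and y_j^{*2} + 2/(\xi_z^k)^2
% for censored ones. Setting its derivative to zero gives
\[
\frac{2}{\xi_z} \sum_j P'_{jz} = \xi_z \sum_j P'_{jz}\, E_j[Y^2]
\quad\Longrightarrow\quad
\xi_z^2 = \frac{2 \sum_j P'_{jz}}{\sum_j P'_{jz}\, E_j[Y^2]} .
\]
```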


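It should be pointed out that the maximization of $Q(\theta, \theta^k)$ is
subject to the constraint $\sum_{z=1}^{p} \lambda_z = 1$. By the method of
Lagrange multipliers, we have the new objective function:

$$Q(\theta, \theta^k) - \alpha \Big( \sum_{z=1}^{p} \lambda_z - 1 \Big).$$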
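
Setting the derivative of this objective with respect to $\lambda_z$ to zero
gives $\lambda_z$ proportional to the total membership weight of class $z$, and
the constraint then forces $\alpha = n$. A minimal Python sketch of the
resulting M-step for Rayleigh components is given below; the function name, the
layout of the weight lists and the numerical values are assumptions made for
illustration, not the authors' implementation.

```python
import math


def m_step(y_obs, w_obs, y_cen, w_cen, xi_old):
    """One M-step for a Rayleigh mixture under progressive censoring.

    y_obs, y_cen: observed failure times and censoring times.
    w_obs[j][z], w_cen[j][z]: E-step weights P'_{1jz} and P'_{2jz}.
    xi_old[z]: current scale parameters xi_z^k.
    Returns the updated (lam, xi).
    """
    n = len(y_obs) + len(y_cen)
    lam, xi = [], []
    for z, xi_z in enumerate(xi_old):
        weight = sum(w[z] for w in w_obs) + sum(w[z] for w in w_cen)
        second_moment = sum(w[z] * y * y for y, w in zip(y_obs, w_obs))
        # A censored unit contributes E[Y^2 | Y > y*] = y*^2 + 2 / xi_z^2.
        second_moment += sum(
            w[z] * (y * y + 2.0 / (xi_z * xi_z)) for y, w in zip(y_cen, w_cen)
        )
        lam.append(weight / n)                              # Eq. (25)
        xi.append(math.sqrt(2.0 * weight / second_moment))  # Eq. (26)
    return lam, xi


# Hypothetical two-class example with three failures and one censored unit.
y_obs, y_cen = [0.3, 1.1, 2.0], [2.0]
w_obs = [[0.9, 0.1], [0.4, 0.6], [0.1, 0.9]]
w_cen = [[0.05, 0.95]]
lam, xi = m_step(y_obs, w_obs, y_cen, w_cen, xi_old=[3.0, 0.7])
```

With a single class and no censoring the update reduces to
$\xi^2 = 2n / \sum_j y_j^2$, the usual maximum likelihood estimate for this
parametrization of the Rayleigh distribution.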


4 Numerical results 

In this section, we use the Monte-Carlo method to test the proposed method.
The simulated data set is drawn from the mixed-Rayleigh distribution shown in
Eq. (24) with $p = 3$, $\lambda = (1/3, 1/3, 1/3)$ and $\xi = (4, 0.5, 0.8)$.
The test scheme is $n = 500$, $m = 0.6\,n$, $R = (0, 0, \cdots, n - m)_{1 \times m}$,
where $m$ is the number of observed failures (denoted $J$ in Section 2.1). Let
the initial values be $\lambda^0 = (1/3, 1/3, 1/3)$ and
$\xi^0 = (4, 0.5, 0.8) - 0.01$. As mentioned before, usually there is no
information about the subclass labels of the data, which is the case of
unsupervised learning. But in real life, we may get some prior uncertain
knowledge from experts or experience. This partial information is assumed here
to be in the form of belief functions.

Fig. 1. Average RABias values (plus and minus one standard deviation) for 20
repeated experiments, as a function of the error probability $\rho$ for the
simulated labels. Panels: a. Estimation of $\xi_1$; b. Estimation of $\xi_2$;
c. Estimation of $\xi_3$.

To simulate the uncertainty on the labels of the data, the original generated
data sets are corrupted as follows. For each datum $j$, an error probability
$q_j$ is drawn randomly from a beta distribution with mean $\rho$ and standard
deviation 0.2. The value $q_j$ expresses the doubt of the experts about the
class of sample $j$. With probability $q_j$, the label of sample $j$ is changed
to any of the three classes (denoted by $z_j^*$) with equal probabilities. The
plausibilities are then determined as

$$pl_{Z_j}(z_j) = \begin{cases} \dfrac{q_j}{3} & \text{if } z_j \neq z_j^*, \\[1ex] \dfrac{q_j}{3} + 1 - q_j & \text{if } z_j = z_j^*. \end{cases} \tag{27}$$
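
The label corruption procedure can be sketched in Python as follows. The
method-of-moments conversion from mean and standard deviation to the two beta
shape parameters, the generalization of the factor 3 to `n_classes`, and all
names are assumptions made for this illustration.

```python
import random


def beta_parameters(mean, sd):
    """Method-of-moments shape parameters of a beta law with given mean, sd."""
    nu = mean * (1.0 - mean) / (sd * sd) - 1.0
    if nu <= 0.0:
        raise ValueError("no beta distribution has this mean and sd")
    return mean * nu, (1.0 - mean) * nu


def soft_label(true_label, rho, rng, n_classes=3, sd=0.2):
    """Corrupt one label and return (noisy_label, plausibilities, q)."""
    a, b = beta_parameters(rho, sd)
    q = rng.betavariate(a, b)              # error probability q_j
    noisy = true_label
    if rng.random() < q:                   # relabel uniformly among all classes
        noisy = rng.randrange(n_classes)
    pl = [q / n_classes] * n_classes       # Eq. (27), case z_j != z_j*
    pl[noisy] += 1.0 - q                   # Eq. (27), case z_j == z_j*
    return noisy, pl, q


rng = random.Random(1)
noisy, pl, q = soft_label(true_label=0, rho=0.1, rng=rng)
```

A beta distribution with standard deviation 0.2 exists only when
$\rho(1-\rho) > 0.04$, so the helper raises an error for means too close to 0
or 1.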


The results of our approach with uncertain labels are compared with the cases
of noisy labels and of no information on labels. The former case, with noisy
labels, is like supervised learning, while the latter is the traditional EM
algorithm applied to progressively censored data. In each case, the E2M (or EM)
algorithm is run 20 times. The parameter estimates are compared to their true
values using the absolute relative bias (RABias). We recall that this commonly
used measure equals 0 for an exactly correct estimate $\hat{\theta} = \theta$.
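
The evaluation measure can be sketched as follows. The text does not spell out
the formula, so the definition $|\hat{\theta} - \theta| / |\theta|$ used here
is an assumption, as are the function names and the example estimates.

```python
import statistics


def rabias(estimate, true_value):
    """Absolute relative bias |estimate - true| / |true| (assumed definition)."""
    return abs(estimate - true_value) / abs(true_value)


def summarize(estimates, true_value):
    """Mean and sample standard deviation of RABias over repeated runs."""
    values = [rabias(e, true_value) for e in estimates]
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical estimates of xi_2 = 0.5 from five repeated runs.
mean_rb, sd_rb = summarize([0.52, 0.47, 0.55, 0.50, 0.49], true_value=0.5)
```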

Fig. 2. Average RABias values (plus and minus one standard deviation) for 20
repeated experiments, as a function of the sample number $n$. Panels:
a. Estimation of $\xi_1$; b. Estimation of $\xi_2$; c. Estimation of $\xi_3$.

The results are shown graphically in Figure 1. As expected, the estimation
performance degrades as the error probability $\rho$ increases when noisy or
uncertain labels are used. But our solution based on soft labels does not
suffer as much as the one using noisy labels, and it clearly outperforms
supervised learning with noisy labels. The estimates of $\xi_1$ and $\xi_3$
obtained by our approach (uncertain labels) are better than those of
unsupervised learning with unknown labels. Although the estimate of $\xi_2$
using uncertain labels does not seem better than that of the traditional EM
algorithm when $\rho$ is large, the results still indicate that our approach is
able to exploit additional information on data uncertainty when such
information is available, as is the case when $\rho$ is small.

In the following experiment, we test the algorithm with different sample
numbers $n$. In order to illustrate the behavior of the approach with respect
to $n$, we consider a fixed censoring scheme in which $m$, the number of
observed failures, is 60% of the sample size. With a given $n$, the test scheme
is as follows: $m = 0.6\,n$, $R = (0, 0, \cdots, n - m)_{1 \times m}$. Let the
error probability be $\rho = 0.1$. We again compare our method using uncertain
labels with the methods using noisy labels and using no label information. The
RABias of the results for the different methods is shown in Figure 2. We reach
similar conclusions as before: the uncertainty on class labels appears to be
successfully exploited by the proposed approach. Moreover, as $n$ increases,
the RABias decreases, which reflects the large-sample properties of
maximum-likelihood estimation.

5 Conclusion 

In this paper, we investigate how to apply the E2M algorithm to the analysis of
progressively censored data. The numerical results show that the proposed
method based on the E2M algorithm behaves better in terms of the RABias of the
parameter estimates, as it can take advantage of the available information on
data uncertainty. Thus belief function theory is an effective tool to represent
and deal with uncertain information in reliability evaluation. The Monte-Carlo
simulations show that the RABias decreases as $n$ increases in all cases, so
the method improves with larger sample sizes.

Mixture distributions are widely used in reliability engineering. Engineers
find that tubes and other devices often fail at an early stage, while the
failure rate later remains stable or continues to rise over time. From a
statistical point of view, such products should be regarded as coming from
mixed distributions. Besides, when the reliability of these complex products is
evaluated, there is often not enough prior information. Therefore, the proposed
method is of practical value in this case.